Portable Option Discovery for Automated Learning Transfer in Object-Oriented Markov Decision Processes

Authors

  • Nicholay Topin
  • Nicholas Haltmeyer
  • Shawn Squire
  • Robert John Winder
  • Marie desJardins
  • James MacGlashan
Abstract

We introduce a novel framework for option discovery and learning transfer in complex domains that are represented as object-oriented Markov decision processes (OO-MDPs) [Diuk et al., 2008]. Our framework, Portable Option Discovery (POD), extends existing option discovery methods, and enables transfer across related but different domains by providing an unsupervised method for finding a mapping between object-oriented domains with different state spaces. The framework also includes heuristic approaches for increasing the efficiency of the mapping process. We present the results of applying POD to Pickett and Barto’s [2002] PolicyBlocks and MacGlashan’s [2013] Option-Based Policy Transfer in two application domains. We show that our approach can discover options effectively, transfer options among different domains, and improve learning performance with low computational overhead.
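To make the idea of a mapping between object-oriented domains concrete, the sketch below enumerates every class-preserving, one-to-one assignment of objects in one domain to objects in another. It is an illustration only and not the POD procedure itself: the function name `candidate_mappings`, the dictionary layout, and the example objects are all hypothetical. The number of candidates grows combinatorially with the number of objects per class, which is one plausible reason heuristics for the mapping step are useful.

```python
from itertools import permutations, product


def candidate_mappings(source, target):
    """Yield each class-preserving injective mapping of source objects to target objects.

    `source` and `target` are hypothetical inputs: dicts from class name
    to a list of object names in that domain.
    """
    per_class = []
    for cls in sorted(source):
        src_objs = source[cls]
        tgt_objs = target.get(cls, [])
        # Every source object needs a distinct target object of the same class.
        per_class.append([
            dict(zip(src_objs, chosen))
            for chosen in permutations(tgt_objs, len(src_objs))
        ])
    # Combine one per-class assignment from each class into a full mapping.
    for combo in product(*per_class):
        mapping = {}
        for part in combo:
            mapping.update(part)
        yield mapping


# Hypothetical domains: one agent in each, one block versus three blocks.
source = {"agent": ["a0"], "block": ["b0"]}
target = {"agent": ["a0"], "block": ["b0", "b1", "b2"]}
mappings = list(candidate_mappings(source, target))
```

If the target domain has fewer objects of some class than the source, the generator yields nothing, since no one-to-one assignment exists under this assumption.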

Related resources

Automated Discovery of Options in Factored Reinforcement Learning

Factored Reinforcement Learning (FRL) is a method to solve Factored Markov Decision Processes when the structure of the transition and reward functions of the problem must be learned. In this paper, we present TeXDYNA, an algorithm that combines the abstraction techniques of Semi-Markov Decision Processes to perform the automatic hierarchical decomposition of the problem with an FRL method. The...

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

Modeling the Effects of Urmia Lake's Retreat on the Villages of Its Eastern Shore Using Object-Oriented Processing of Satellite Images

Urmia Lake is one of the largest hypersaline lakes in the world and the largest inland lake in Iran. It is located in the northwest of Iran, between the provinces of East Azerbaijan and West Azerbaijan. The lake basin is one of the most influential and valuable aquatic ecosystems in the country and is registered as a UNESCO Biosphere Reserve. In addition, it is very important in terms of water resource...

Automated Tumor Segmentation Based on Hidden Markov Classifier using Singular Value Decomposition Feature Extraction in Brain MR images

Introduction: Diagnosing a brain tumor is not always easy for doctors, and an assistant that facilitates the interpretation process is an asset in the clinic. Computer vision techniques are devised to aid the clinic in detecting tumors based on a database of tumor c...

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques for solving large Markov decision processes (MDPs) are based on partitioning the state space into strongly connected components (SCCs) that can be organized into levels. In each level, smaller problems called restricted MDPs are solved, and these partial solutions are then combined to obtain the global solution. In this paper, we first propose a novel algorith...

Publication date: 2015